Abstract: In this paper, we consider the problem of sensor
selection for parameter estimation with correlated measurement
noise. We seek optimal sensor activations by formulating an
optimization problem, in which the estimation error, given by the
trace of the inverse of the Bayesian Fisher information matrix,
is minimized subject to energy constraints. Fisher information
has been widely used as an effective sensor selection criterion.
However, existing information-based sensor selection methods are
limited to the case of uncorrelated noise or weakly correlated
noise due to the use of approximate metrics. By contrast, here
we derive the closed form of the Fisher information matrix with
respect to sensor selection variables that is valid for an arbitrary
noise correlation regime, and develop both a convex relaxation
approach and a greedy algorithm to find near-optimal solutions.
We further extend our framework of sensor selection to solve
the problem of sensor scheduling, where a greedy algorithm is
proposed to determine non-myopic (multi-time step ahead) sensor
schedules. Lastly, numerical results are provided to illustrate the
effectiveness of our approach, and to reveal the effect of noise
correlation on estimation performance.

Index Terms —Sensor selection, sensor scheduling, parameter 
estimation, correlated noise, convex relaxation. 


I. Introduction 

WIRELESS sensor networks consisting of a large number
of spatially distributed sensors have been widely
used for environmental monitoring, source localization, and
target tracking llj- In these applications,
sensors observe an unknown parameter or state of
interest and transmit their measurements to a fusion center,
which then determines the global estimate. However, due to
constraints on the communication bandwidth and sensor
battery life, it may not be desirable to have all the sensors
report their measurements at all time instants. Therefore, the
problem of sensor selection/scheduling arises, which aims
to strike a balance between estimation accuracy and sensor
activations over space and/or time. The importance of sensor
selection has been discussed extensively in the context of
various applications, such as target tracking Q, bit allocation 
0, field monitoring 0. 0. optimal control Q, power 
allocation 0,®, optimal experiment design pT[ , and leader 
selection in consensus networks G3- 

In this paper, we focus on the problem of sensor selection/scheduling
for parameter estimation similar to [13]-[15],
but with a key difference in that the measurement noise is
correlated in the problem formulation. In flT) , the sensor
selection problem was elegantly formulated under linear measurement
models, and solved via convex optimization. In p^ ,
the problem of sensor selection was generalized to nonlinear
measurement models by using the Cramér-Rao bound as the
sensor selection criterion. In p^ , a particular class of sensor
selection problems was transformed into the problem of
leader selection in dynamical networks. In [15], the problem
of non-myopic scheduling, which determines sensor activations
over multiple future time steps, was addressed for nonlinear
filtering with quantized measurements.

In the existing literature pg-pg, the study of sensor
selection/scheduling problems hinges on the assumption of
uncorrelated measurement noise, which implies that sensor
observations are conditionally independent given the underlying
parameter. Due to conditional independence, each measurement
contributes to Fisher information (equivalently, the inverse
of the Cramér-Rao bound on the error covariance matrix)
in an additive manner pg. Accordingly, Fisher information
becomes a linear function of the sensor selection
variables (which characterize the subset of sensors we select),
and thus the resulting selection problem can be efficiently
handled via convex optimization [13], [14]. However, the
sensed data is often corrupted by correlated noise due to the
nature of the monitored physical environment PD- Therefore,
the development of sensor selection schemes for correlated
measurements is a critical task.

Recently, it has been shown in [18]-[20] that the presence
of correlated noise makes optimal sensor selection/scheduling
problems more challenging, since Fisher information is no
longer a linear function of the selection variables.
In [18]-[20], the problem of sensor selection with correlated
noise was formulated so as to minimize an approximate
expression of the estimation error subject to an energy constraint,
or to minimize the energy consumption subject to an
approximate estimation constraint. In [21], a reformulation
of the multi-step Kalman filter was introduced to schedule
sensors for linear dynamical systems with correlated noise.

Different from [18]-[21], here we derive the closed-form expression
of the estimation error with respect to the sensor selection
variables under correlated measurement noise, which is valid
for an arbitrary noise correlation matrix. This expression is
optimized via a convex relaxation method to determine the
optimal sensor selection scheme. We also propose a greedy
algorithm to solve the corresponding sensor selection problem,
where we show that when an inactive sensor is made active,
the increase in Fisher information yields an information gain
in terms of a rank-one matrix. The proposed sensor selection
framework yields a more accurate sensor selection scheme
than those presented in [18]-[20], because the schemes of
[18]-[20] consider an approximate formulation where the
noise covariance matrix is assumed to be independent of the
sensor selection variables. We further demonstrate that the
prior formulations for sensor selection are valid only when
measurement noises are weakly correlated. In this scenario,
maximization of the trace of the Fisher information matrix
used in [20] is equivalent to the problem of maximizing a
convex quadratic function over a bounded polyhedron. The
resulting problem structure enables the use of optimization
methods with reduced computational complexity.

Compared to [21], we adopt the recursive Fisher information
to measure the estimation performance of sensor scheduling.
However, for non-myopic (multi-time step ahead) schedules,
the Fisher information matrices at consecutive time steps are
coupled with each other. Due to this coupling, expressing the
Fisher information matrices in closed form is intractable.
Therefore, we propose a greedy algorithm to seek non-myopic
sensor schedules subject to cumulative and individual energy
constraints. Numerical results show that our approach yields
better estimation performance than that of [21] for state
tracking.

In a preliminary version of this paper p2) , we studied the 
problem of sensor selection using the same framework as 
in Compared to p^ , we have the following new 

contributions in this paper. 

• We propose a more general but tractable sensor selection
framework that is valid for an arbitrary noise correlation
matrix, and present a suite of efficient optimization algorithms.

• We reveal drawbacks of the existing formulations in [18]-[20]
for sensor selection, and demonstrate that they are valid
only in the weak noise correlation regime.

• We extend the proposed sensor selection approach to
address the problem of non-myopic sensor scheduling,
where the length of the time horizon and energy constraints
on individual sensors are taken into account.

The rest of the paper is organized as follows. In Section II,
we formulate the problem of sensor selection with correlated
noise. In Section III, we present a convex relaxation approach
and a greedy algorithm to solve the problem of sensor selection
with an arbitrary noise correlation matrix. In Section IV,
we present the sensor selection approach for weakly correlated
noise. In Section V, we extend our framework to solve the
problem of non-myopic sensor scheduling. In Section VI, we
provide numerical results to illustrate the effectiveness of our
proposed methods. In Section VII, we summarize our work
and discuss future research directions.


II. Problem Formulation 

We wish to estimate a random vector $x \in \mathbb{R}^n$ with a
Gaussian prior probability density function (PDF) $\mathcal{N}(\mu, \Sigma)$.
Observations of $x$ from $m$ sensors are corrupted by correlated
measurement noise. To strike a balance between estimation
accuracy and sensor activations, we formulate the problem of
sensor selection, where the estimation error is minimized subject
to a constraint on the total number of sensor activations.

Consider a linear system 


$$ y = Hx + v, \tag{1} $$

where $y \in \mathbb{R}^m$ is the measurement vector whose $i$th entry
corresponds to a scalar observation from the $i$th sensor,
$H \in \mathbb{R}^{m \times n}$ is the observation matrix, and $v \in \mathbb{R}^m$ is the
measurement noise vector that follows a Gaussian distribution
with zero mean and covariance matrix $R$.
We assume that $x$ and $v$ are mutually independent random
variables, and that the noise covariance matrix is positive definite
and thus invertible. We note that the noise covariance matrix
is not restricted to being diagonal, so that the measurement
noise could be correlated among the sensors. We also note
that in practice, the first two moments of $x$ can be learnt from
a parametric covariance model, such as a power exponential
model together with a training dataset of the parameter pT| .

The task of sensor selection is to determine the best subset 
of sensors to activate in order to minimize the estimation 
error, subject to a constraint on the number of activations. We 
introduce a sensor selection vector to represent the activation 
scheme 


$$ w = [w_1, w_2, \ldots, w_m]^T, \quad w_i \in \{0, 1\}, \tag{2} $$

where $w_i$ indicates whether or not the $i$th sensor is selected.
For example, if the $i$th sensor reports a measurement then
$w_i = 1$, otherwise $w_i = 0$. In other words, the active sensor
measurements can be compactly expressed as

$$ y_w = \Phi_w y = \Phi_w H x + \Phi_w v, \tag{3} $$

where $y_w \in \mathbb{R}^{\|w\|_1}$ is the vector of measurements of the selected
sensors, $\|w\|_1$ is the $\ell_1$-norm of $w$, which yields the total
number of sensor activations, $\Phi_w \in \{0,1\}^{\|w\|_1 \times m}$ is the
submatrix of $\mathrm{diag}(w)$ after all rows corresponding to the
unselected sensors have been removed, and $\mathrm{diag}(w)$ is a
diagonal matrix whose diagonal entries are given by $w$. We
note that $\Phi_w$ and $w$ are linked as below

$$ \Phi_w \Phi_w^T = I_{\|w\|_1} \quad \text{and} \quad \Phi_w^T \Phi_w = \mathrm{diag}(w), \tag{4} $$

where $I_{\|w\|_1}$ denotes the identity matrix of dimension $\|w\|_1$.
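
As a small illustration of the selection matrix in (3) and the two identities in (4), the following NumPy sketch builds $\Phi_w$ from a Boolean vector. The helper name `selection_matrix` and the example activation pattern are hypothetical and are not part of the paper.

```python
import numpy as np

def selection_matrix(w):
    """Return Phi_w: the rows of diag(w) that belong to selected sensors."""
    w = np.asarray(w, dtype=int)
    return np.eye(len(w), dtype=int)[w == 1]

w = np.array([1, 0, 1, 1, 0])          # hypothetical activation pattern (m = 5)
Phi = selection_matrix(w)              # shape (||w||_1, m) = (3, 5)

# The two identities in (4).
assert np.array_equal(Phi @ Phi.T, np.eye(w.sum(), dtype=int))
assert np.array_equal(Phi.T @ Phi, np.diag(w))

# Phi_w keeps the measurements of the active sensors only, as in (3).
y = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
y_w = Phi @ y
```

Here `y_w` contains the entries of `y` at the active positions, in their original order.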

A. Minimum mean-squared estimation error

We employ the minimum mean square error (MMSE) estimator
to estimate the unknown parameter under the Bayesian
setup. It is worth mentioning that the use of the Bayesian
estimation framework ensures the validity of parameter estimation
for an underdetermined system, in which the number
of selected sensors is less than the dimension of the parameter
to be estimated, namely, $\|w\|_1 < n$.

Given the Gaussian linear measurement model (1), the prior
distribution of the unknown parameter $x$ and the active sensor
measurements $y_w$, the error covariance matrix of the MMSE
estimate of $x$ is given by [24, Theorem 12.1]

$$ P_w = \left( \Sigma^{-1} + H^T \Phi_w^T R_w^{-1} \Phi_w H \right)^{-1}, \tag{5} $$

where the matrix $\Phi_w H$ comprises the rows of $H$ for the active
sensors, and $R_w$ denotes the submatrix of $R$ after all rows
and columns corresponding to the inactive sensors have been
removed, i.e.,

$$ R_w = \Phi_w R \Phi_w^T. \tag{6} $$

It is clear from (5) that due to the presence of the prior
knowledge about $\Sigma$, the MSE matrix $P_w$ is always well
defined, even if the matrix $H^T \Phi_w^T R_w^{-1} \Phi_w H$ is not invertible
for an underdetermined system with $\|w\|_1 < n$.

It is known from m that the MSE matrix $P_w$ is the
inverse of the Bayesian Fisher information matrix $J_w$ under
the linear Gaussian measurement model with a Gaussian prior
distribution. We thus obtain

$$ J_w = P_w^{-1} = \Sigma^{-1} + H^T \Phi_w^T R_w^{-1} \Phi_w H, \tag{7} $$

where the second term is related to the sensor selection
scheme. In this paper, for clarity of presentation, we choose
to work with $J_w$ rather than $P_w$.

It is clear from (6) and (7) that the dependence of $J_w$ on $w$
is through $\Phi_w$. This dependency does not lend itself to easy
optimization of scalar-valued functions of $J_w$ with respect to
$w$. In what follows, we will rewrite $J_w$ as an explicit function
of the selection vector $w$.

B. Fisher information $J_w$ as an explicit function of $w$

The key idea of expressing (7) as an explicit function of $w$
is to replace $\Phi_w$ with $w$ based on their relationship given by
(4). Consider a decomposition of the noise covariance matrix

$$ R = aI + S, \tag{8} $$

where the positive scalar $a$ is chosen such that the matrix $S$ is
positive definite, and $I$ is the identity matrix. We remark that
the decomposition given in (8) is readily obtained through
an eigenvalue decomposition of the positive definite matrix
$R$, and it helps us in deriving the closed form of the Fisher
information matrix with respect to $w$.

Substituting (8) into (6), we obtain

$$ R_w = \Phi_w (aI + S) \Phi_w^T = a I_{\|w\|_1} + \Phi_w S \Phi_w^T, \tag{9} $$

where the last equality holds due to (4).

Using (9), we can rewrite a part of the second term on the
right hand side of (7) as

$$
\begin{aligned}
\Phi_w^T R_w^{-1} \Phi_w
&= \Phi_w^T \left( a I_{\|w\|_1} + \Phi_w S \Phi_w^T \right)^{-1} \Phi_w \\
&\overset{(1)}{=} S^{-1} - S^{-1} \left( S^{-1} + a^{-1} \Phi_w^T \Phi_w \right)^{-1} S^{-1} \\
&\overset{(2)}{=} S^{-1} - S^{-1} \left( S^{-1} + a^{-1} \mathrm{diag}(w) \right)^{-1} S^{-1},
\end{aligned}
\tag{10}
$$

where step (1) is obtained from the matrix inversion lemma (see footnote 1),
and step (2) holds due to (4).

Substituting (10) into (7), the Fisher information matrix can
be expressed as

$$ J_w = \Sigma^{-1} + H^T S^{-1} H - H^T S^{-1} \left( S^{-1} + a^{-1} \mathrm{diag}(w) \right)^{-1} S^{-1} H. \tag{11} $$

It is clear from (11) that the decomposition of $R$ in (8),
together with equations (9)-(10), allows us to make explicit
and isolate the dependence of $J_w$ on $w$. We also remark that
the positive scalar $a$ and positive definite matrix $S$ can be
arbitrarily chosen, and have no effect on the performance of
the sensor selection algorithms that will be proposed later on.
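
The equivalence between the direct form (7) and the explicit form (11) can be checked numerically. The sketch below uses randomly generated $H$ and $R$, an identity prior covariance, and the choice $a = \lambda_{\min}(R)/2$; all of these are illustrative assumptions rather than settings taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 6, 3                                   # sensors, parameter dimension (illustrative)
H = rng.standard_normal((m, n))
G = rng.standard_normal((m, m))
R = G @ G.T + m * np.eye(m)                   # positive definite, non-diagonal noise covariance
Sigma = np.eye(n)                             # prior covariance (assumed identity here)
w = np.array([1, 0, 1, 1, 0, 0])

# Direct form (7): truncate R to the active sensors, then invert.
Phi = np.eye(m)[w == 1]
R_w = Phi @ R @ Phi.T
J_direct = np.linalg.inv(Sigma) + H.T @ Phi.T @ np.linalg.inv(R_w) @ Phi @ H

# Explicit form (11): R = a*I + S with 0 < a < lambda_min(R), so S is positive definite.
a = 0.5 * np.linalg.eigvalsh(R).min()
S = R - a * np.eye(m)
S_inv = np.linalg.inv(S)
J_explicit = (np.linalg.inv(Sigma) + H.T @ S_inv @ H
              - H.T @ S_inv @ np.linalg.inv(S_inv + np.diag(w) / a) @ S_inv @ H)

print(np.allclose(J_direct, J_explicit))
```

Any other value of `a` in the open interval $(0, \lambda_{\min}(R))$ should give the same matrix, in line with the remark that the choice of $a$ and $S$ does not affect the result.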

C. Formulation of the optimal sensor selection problem 

We now state the main optimization problem considered in 
this work as 

$$
\begin{array}{ll}
\underset{w}{\text{minimize}} & \mathrm{tr}\left(J_w^{-1}\right) \\
\text{subject to} & \mathbf{1}^T w \le s, \\
& w \in \{0,1\}^m,
\end{array}
\tag{P0}
$$

where $J_w \in \mathbb{R}^{n \times n}$ is given by (11), and $s \le m$ is a prescribed
energy budget given by the maximum number of sensors to be
activated. We recall that $n$ is the dimension of the parameter
to be estimated, and $m$ is the number of sensors.

We note that (P0) is a nonconvex optimization problem due
to the presence of Boolean selection variables. Moreover, if
we drop the source statistics $\Sigma$ from the MSE matrix (5)
and impose the assumption $s \ge n$, the proposed formulation
(P0) is then applicable for sensor selection in a non-Bayesian
framework, where the unknown parameter is estimated through
the best linear unbiased estimator [24].

In what follows, we discuss two special cases of the formulation
of the sensor selection problem under two different
structures of the noise covariance matrix $R$: a) $R$ is diagonal,
and b) $R$ has small off-diagonal entries.

D. Formulation for two special cases 

When measurement noises are uncorrelated, the noise covariance
matrix $R$ becomes diagonal. From (6) and (7), the
Fisher information matrix in the objective function of (P0)
simplifies to

$$ J_w = \Sigma^{-1} + H^T \mathrm{diag}(w) R^{-1} \mathrm{diag}(w) H = \Sigma^{-1} + \sum_{i=1}^{m} w_i R_{ii}^{-1} h_i h_i^T, \tag{12} $$

where $h_i^T$ denotes the $i$th row of $H$, $R_{ii}$ denotes the $i$th
diagonal entry of $R$, and the last equality holds due to the
fact that

$$ w_i^2 = w_i, \quad i = 1, 2, \ldots, m. \tag{13} $$

Footnote 1: For appropriate matrices $A$, $B$, $C$ and $D$, the matrix inversion lemma
states that $(A + BCD)^{-1} = A^{-1} - A^{-1}B(C^{-1} + DA^{-1}B)^{-1}DA^{-1}$,
which yields $B(C^{-1} + DA^{-1}B)^{-1}D = A - A(A + BCD)^{-1}A$.
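
For the uncorrelated case, the additive structure in (12) can be verified against the general expression (7). The following is a minimal sketch with randomly generated data and an identity prior information matrix, both of which are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 5, 2
H = rng.standard_normal((m, n))
R = np.diag(rng.uniform(0.5, 2.0, size=m))    # diagonal R: uncorrelated noise
Sigma_inv = np.eye(n)                         # assumed prior information matrix
w = np.array([0, 1, 1, 0, 1])

# General expression (7).
Phi = np.eye(m)[w == 1]
J_general = Sigma_inv + H.T @ Phi.T @ np.linalg.inv(Phi @ R @ Phi.T) @ Phi @ H

# Additive form (12): one rank-one term per active sensor.
J_additive = Sigma_inv + sum(w[i] / R[i, i] * np.outer(H[i], H[i]) for i in range(m))

print(np.allclose(J_general, J_additive))
```

Because each term is weighted linearly by $w_i$, the Fisher information is linear in $w$ in this special case.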


It is clear from (12) that each sensor contributes to Fisher
information in an additive manner. As demonstrated in [13]
and [14], the linearity of the inverse mean squared error (Fisher
information) with respect to $w$ enables the use of convex
optimization to solve the problem of sensor selection.

When measurement noises are weakly correlated (namely,
$R$ has small off-diagonal entries), it will be shown in Sec. IV
that the Fisher information matrix can be approximately
expressed as

$$ \hat{J}_w := \Sigma^{-1} + H^T (w w^T \circ R^{-1}) H, \tag{14} $$

where $\circ$ stands for the Hadamard (elementwise) product. The
problem of sensor selection with weakly correlated noise
becomes

$$
\begin{array}{ll}
\underset{w}{\text{minimize}} & \mathrm{tr}\left( \left( \Sigma^{-1} + H^T (w w^T \circ R^{-1}) H \right)^{-1} \right) \\
\text{subject to} & \mathbf{1}^T w \le s, \\
& w \in \{0,1\}^m.
\end{array}
\tag{P1}
$$

Compared to the generalized formulation (P0), the objective
function of (P1) is convex with respect to the rank-one matrix
$ww^T$. Such structure introduces computational benefits while
solving (P1). We emphasize that (P1) has been formulated in
[18]-[20] for sensor selection with correlated noise; however,
using this formulation without acknowledging that it is only
valid when the correlation is weak can lead to incorrect
sensor selection results. We elaborate on the problem of sensor
selection with weakly correlated noise in Sec. IV.


III. General Case: Proposed Optimization Methods
for Sensor Selection

In this section, we present two methods to solve (P0): the
first is based on convex relaxation techniques, and the second
is based on a greedy algorithm. First, we show that after
relaxing the Boolean constraints the selection problem can
be cast as a standard semidefinite program (SDP). Given the
solution of the relaxed (P0), we then use the randomization
method to generate a near-optimal selection scheme. Next, we
show that given a subset of sensors, activating a new sensor
always improves the estimation performance. Motivated by
this, we present a greedy algorithm that scales gracefully with
the problem size to obtain locally optimal solutions of (P0).


A. Convex relaxation 

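Substituting the expression of Fisher information (11) into
problem (P0), we obtain

$$
\begin{array}{ll}
\underset{w}{\text{minimize}} & \mathrm{tr}\left( \left( C - B^T \left( S^{-1} + a^{-1} \mathrm{diag}(w) \right)^{-1} B \right)^{-1} \right) \\
\text{subject to} & \mathbf{1}^T w \le s, \\
& w \in \{0,1\}^m,
\end{array}
\tag{15}
$$

where for notational simplicity we have defined $C := \Sigma^{-1} + H^T S^{-1} H$
and $B := S^{-1} H$.

Problem (15) can be equivalently transformed to | |2^

$$
\begin{array}{ll}
\underset{w, Z}{\text{minimize}} & \mathrm{tr}(Z) \\
\text{subject to} & C - B^T \left( S^{-1} + a^{-1} \mathrm{diag}(w) \right)^{-1} B \succeq Z^{-1}, \\
& \mathbf{1}^T w \le s, \\
& w \in \{0,1\}^m,
\end{array}
\tag{16}
$$

where $Z \in \mathbb{S}^n$ is an auxiliary variable, $\mathbb{S}^n$ represents the set
of $n \times n$ symmetric matrices, and the notation $X \succeq Y$ (or
$X \preceq Y$) indicates that the matrix $X - Y$ (or $Y - X$) is
positive semidefinite. The first inequality constraint in (16) is
obtained from

$$ \left( C - B^T \left( S^{-1} + a^{-1} \mathrm{diag}(w) \right)^{-1} B \right)^{-1} \preceq Z, $$

which implicitly adds the additional constraint $Z \succ 0$, since
the left hand side of the above inequality is the inverse of the
Fisher information matrix.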

We further introduce another auxiliary variable $V \in \mathbb{S}^n$
such that the first matrix inequality of (16) is expressed as

$$ C - V \succeq Z^{-1}, \tag{17} $$

and

$$ V \succeq B^T \left( S^{-1} + a^{-1} \mathrm{diag}(w) \right)^{-1} B. \tag{18} $$

Note that the minimization of $\mathrm{tr}(Z)$ with inequalities (17) and
(18) would force the variable $V$ to achieve its lower bound.
In other words, problem (16) is equivalent to the problem
in which the inequality constraint in (16) is replaced by the
two inequalities (17) and (18). Finally, employing the Schur
complement, the inequalities (17) and (18) can be rewritten as
the following linear matrix inequalities (LMIs)

$$
\begin{bmatrix} C - V & I \\ I & Z \end{bmatrix} \succeq 0, \qquad
\begin{bmatrix} V & B^T \\ B & S^{-1} + a^{-1} \mathrm{diag}(w) \end{bmatrix} \succeq 0.
\tag{19}
$$

Substituting (19) into (16), the sensor selection problem
becomes

$$
\begin{array}{ll}
\underset{w, Z, V}{\text{minimize}} & \mathrm{tr}(Z) \\
\text{subject to} & \text{LMIs in (19)}, \\
& \mathbf{1}^T w \le s, \\
& w \in \{0,1\}^m.
\end{array}
\tag{20}
$$

Problem (20) has the form of an SDP except for the last
Boolean constraints. One possibility is
to relax each Boolean variable to its convex hull to obtain
$w \in [0,1]^m$. In this case, we can choose $s$ active sensors
given by the $s$ largest entries of the solution of the relaxed
problem, or employ a randomized rounding algorithm Gl
Algorithms] to generate a Boolean selection vector.
Rather than directly relaxing Boolean selection variables
to continuous variables, we can use semidefinite relaxation
(SDR) ^, which refers to problems in which the relaxation of
a rank constraint leads to an SDP, to better overcome the
difficulties posed by the nonconvex constraints of (20). The
Boolean constraint (2) on the entries of $w$ can be enforced
by

$$ \mathrm{diag}(w w^T) = w, \tag{21} $$

where, with an abuse of notation, $\mathrm{diag}(\cdot)$ returns in vector
form the diagonal entries of its matrix argument. By introducing
an auxiliary variable $W$ together with the rank-one
constraint

$$ W = w w^T, \tag{22} $$

the energy and Boolean constraints in (20) can be expressed
as

$$ \mathrm{tr}(W) \le s, \quad \mathrm{diag}(W) = w. \tag{23} $$

After relaxing the (nonconvex) rank-one constraint (22) to
$W \succeq w w^T$, we reach the SDP

$$
\begin{array}{ll}
\underset{w, W, Z, V}{\text{minimize}} & \mathrm{tr}(Z) \\
\text{subject to} & \text{LMIs in (19)}, \\
& \mathrm{tr}(W) \le s, \\
& \mathrm{diag}(W) = w, \\
& \begin{bmatrix} W & w \\ w^T & 1 \end{bmatrix} \succeq 0,
\end{array}
\tag{24}
$$

where the last inequality is derived through the application of
a Schur complement to $W \succeq w w^T$.

We can use an interior-point algorithm to solve the SDP
(24). In practice, if the dimension of the unknown parameter
vector is much less than the number of sensors, the computational
complexity of the SDP is roughly given by $O(m^{4.5})$.

Once the SDP (24) is solved, we employ a randomization
method to generate a near-optimal sensor selection scheme,
where the effectiveness of the randomization method has
been shown in our extensive numerical experiments. We refer
the readers to HZ) for more details on the motivation and
benefits of randomization used in SDR. The aforementioned
procedure is summarized in Algorithm 1, which includes the
randomization procedure described in Algorithm 2.


Algorithm 1 SDR with randomization for sensor selection

Require: prior information $\Sigma$, $R = aI + S$ as in (8),
observation matrix $H$ and energy budget $s$
1: solve the SDP (24) and obtain solution $(w, W)$
2: call Algorithm 2 for Boolean solution.
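
The randomization step invoked by Algorithm 1 can be sketched in NumPy as follows. This is only an illustration under stated assumptions: the function names `fisher_information` and `randomize` are hypothetical, the SDP (24) is not solved here, and the pair `(w_sdp, W_sdp)` in the usage example is a hand-made placeholder that merely satisfies $\mathrm{diag}(W) = w$ and $W \succeq ww^T$.

```python
import numpy as np

def fisher_information(w, H, R, Sigma_inv):
    """J_w from (7) for a Boolean selection vector w."""
    idx = np.flatnonzero(w)
    if idx.size == 0:
        return Sigma_inv.copy()
    H_w, R_w = H[idx], R[np.ix_(idx, idx)]
    return Sigma_inv + H_w.T @ np.linalg.solve(R_w, H_w)

def randomize(w_sdp, W_sdp, s, H, R, Sigma_inv, num_draws=100, seed=0):
    """Sample around the relaxed solution, keep the s largest entries, return the best draw."""
    rng = np.random.default_rng(seed)
    cov = W_sdp - np.outer(w_sdp, w_sdp)
    cov = (cov + cov.T) / 2                       # symmetrize against round-off
    best_w, best_cost = None, np.inf
    for _ in range(num_draws):
        xi = rng.multivariate_normal(w_sdp, cov)
        w = np.zeros(len(xi), dtype=int)
        w[np.argsort(xi)[-s:]] = 1                # entries at or above the s-th largest
        cost = np.trace(np.linalg.inv(fisher_information(w, H, R, Sigma_inv)))
        if cost < best_cost:
            best_w, best_cost = w, cost
    return best_w, best_cost

# Usage with placeholder inputs (not an actual SDP solution).
rng = np.random.default_rng(2)
m, n, s = 8, 3, 3
H = rng.standard_normal((m, n))
G = rng.standard_normal((m, m))
R = G @ G.T + np.eye(m)
w_sdp = np.full(m, s / m)
W_sdp = np.outer(w_sdp, w_sdp) + np.diag(w_sdp * (1 - w_sdp))
w_best, cost = randomize(w_sdp, W_sdp, s, H, R, np.eye(n))
```

Each draw is mapped to a selection with exactly $s$ active sensors, and the candidate with the smallest value of $\mathrm{tr}(J_w^{-1})$ is returned.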


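Algorithm 2 Randomization method

Require: solution pair $(w, W)$ from the SDP (24)
1: for $l = 1, 2, \ldots, N$ do
2: pick a random vector $\xi^{(l)} \sim \mathcal{N}(w, W - w w^T)$
3: map $\xi^{(l)}$ to a sub-optimal sensor selection scheme $w^{(l)}$

$$ w_j^{(l)} = \begin{cases} 1 & \xi_j^{(l)} \ge \xi_{(s)}^{(l)}, \\ 0 & \text{otherwise}, \end{cases} \quad j = 1, 2, \ldots, m, $$

where $\xi_j^{(l)}$ is the $j$th element of $\xi^{(l)}$, and $\xi_{(s)}^{(l)}$
denotes the $s$th largest entry of $\xi^{(l)}$
4: end for
5: choose a vector in $\{w^{(l)}\}_{l=1}^{N}$ which yields the smallest
objective value of (P0).


B. Greedy algorithm

We begin by showing in Proposition 1 that even in the presence
of correlated measurement noise, the Fisher information
increases if an inactive sensor is made active.

Proposition 1: If $w$ and $\tilde{w}$ represent two sensor selection
vectors, where $\tilde{w}_i = w_i$ for $i \in \{1, 2, \ldots, m\} \setminus \{j\}$, $w_j = 0$
and $\tilde{w}_j = 1$, then the resulting Fisher information matrix
satisfies $J_{\tilde{w}} \succeq J_w$. More precisely,

$$ J_{\tilde{w}} - J_w = c_j \alpha_j \alpha_j^T, \tag{25} $$

and

$$ \mathrm{tr}(J_w^{-1}) - \mathrm{tr}(J_{\tilde{w}}^{-1}) = \frac{c_j \alpha_j^T J_w^{-2} \alpha_j}{1 + c_j \alpha_j^T J_w^{-1} \alpha_j}, \tag{26} $$

where $c_j$ is a positive scalar given by

$$ c_j = \begin{cases} R_{jj}^{-1} & w = 0, \\ \left( R_{jj} - r_j^T R_w^{-1} r_j \right)^{-1} & \text{otherwise}, \end{cases} \tag{27} $$

and

$$ \alpha_j = \begin{cases} h_j & w = 0, \\ H^T \Phi_w^T R_w^{-1} r_j - h_j & \text{otherwise}. \end{cases} \tag{28} $$

In (27)-(28), $R_{jj}$ is the $j$th diagonal entry of $R$, $r_j$ represents
the covariance vector between the measurement noise of the
$j$th sensor and that of the active sensors in $w$, $h_j^T$ is the $j$th
row of $H$, and $J_w$ and $R_w$ are given by (7) and (6), respectively.
Proof: See Appendix A. ■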
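
The rank-one update of Proposition 1 can be checked numerically. The sketch below assumes the standard Schur-complement form of the update, namely $c_j = (R_{jj} - r_j^T R_w^{-1} r_j)^{-1}$ and $\alpha_j = H^T \Phi_w^T R_w^{-1} r_j - h_j$ for a nonempty active set, and compares it with a direct evaluation of (7); the data and the helper `fisher` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 6, 3
H = rng.standard_normal((m, n))
G = rng.standard_normal((m, m))
R = G @ G.T + np.eye(m)                        # correlated noise covariance
Sigma_inv = np.eye(n)                          # assumed prior information matrix

def fisher(idx):
    """Direct evaluation of (7) for the active index list idx."""
    H_w, R_w = H[idx], R[np.ix_(idx, idx)]
    return Sigma_inv + H_w.T @ np.linalg.solve(R_w, H_w)

active, j = [0, 2, 5], 3                       # current active set and the sensor to add
J_old, J_new = fisher(active), fisher(active + [j])

r_j = R[active, j]                             # noise covariance between sensor j and active sensors
R_w = R[np.ix_(active, active)]
c_j = 1.0 / (R[j, j] - r_j @ np.linalg.solve(R_w, r_j))
alpha_j = H[active].T @ np.linalg.solve(R_w, r_j) - H[j]

J_inv = np.linalg.inv(J_old)
gain = c_j * alpha_j @ J_inv @ J_inv @ alpha_j / (1 + c_j * alpha_j @ J_inv @ alpha_j)

print(np.allclose(J_new - J_old, c_j * np.outer(alpha_j, alpha_j)))            # rank-one increase
print(np.isclose(np.trace(J_inv) - np.trace(np.linalg.inv(J_new)), gain))      # trace reduction
```

The second check is the Sherman-Morrison identity applied to the rank-one increase, which is what makes the per-candidate gain cheap to evaluate.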

It is clear from (25) that when an inactive sensor is made
active, the increase in Fisher information leads to an information
gain in terms of the rank-one matrix given by (25).
Such a phenomenon was also discovered in the calculation of
sensor utility for adaptive signal estimation [29] and leader
selection in stochastically forced consensus networks Gl-
Since activating a new sensor does not degrade the estimation
performance, the inequality (energy) constraint in (P0) can be
reformulated as an equality constraint.

In a greedy algorithm, we iteratively select a new sensor
which gives the largest performance improvement until the energy
constraint is satisfied with equality. The greedy algorithm
is attractive due to its simplicity, and has been employed in a
variety of applications p9) , 1^ . In particular, a greedy
algorithm was proposed in [30] for sensor selection under the
assumption of uncorrelated measurement noise. We generalize
the framework of [30] by taking into account noise correlation.
Clearly, in each iteration of the greedy algorithm, the newly
activated sensor is the one that maximizes the performance
improvement characterized by $\mathrm{tr}(J_w^{-1}) - \mathrm{tr}(J_{\tilde{w}}^{-1})$ in (26). We
summarize the greedy algorithm in Algorithm 3.
Algorithm 3 Greedy algorithm for sensor selection

Require: $w = 0$, $\mathcal{I} = \{1, 2, \ldots, m\}$ and $J_w = \Sigma^{-1}$
1: for $l = 1, 2, \ldots, s$ do
2: given $w$, enumerate all the inactive sensors in $\mathcal{I}$ to
determine $j \in \mathcal{I}$ such that $\mathrm{tr}(J_w^{-1}) - \mathrm{tr}(J_{\tilde{w}}^{-1})$ in (26)
is maximized
3: update $w$ by setting $w_j = 1$, and update $J_w$ by adding
$c_j \alpha_j \alpha_j^T$ in (25)
4: remove $j$ from $\mathcal{I}$.
5: end for
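
A minimal NumPy sketch of the greedy procedure is given below. The function name `greedy_selection` is hypothetical, the per-candidate gain uses the Schur-complement form of $c_j$ and $\alpha_j$, and the test data are random; it is meant to illustrate the loop structure rather than reproduce the authors' implementation.

```python
import numpy as np

def greedy_selection(H, R, Sigma_inv, s):
    """Add, s times, the inactive sensor with the largest reduction in tr(J^{-1})."""
    m = H.shape[0]
    active, inactive = [], list(range(m))
    J = Sigma_inv.copy()                            # J_w for w = 0
    for _ in range(s):
        J_inv = np.linalg.inv(J)
        best = None
        for j in inactive:
            if active:
                R_w = R[np.ix_(active, active)]
                r_j = R[active, j]
                c = 1.0 / (R[j, j] - r_j @ np.linalg.solve(R_w, r_j))
                alpha = H[active].T @ np.linalg.solve(R_w, r_j) - H[j]
            else:
                c, alpha = 1.0 / R[j, j], H[j]
            u = J_inv @ alpha                       # J_inv is symmetric, so u @ u = alpha' J^{-2} alpha
            gain = c * (u @ u) / (1 + c * (alpha @ u))
            if best is None or gain > best[0]:
                best = (gain, j, c, alpha)
        _, j, c, alpha = best
        J = J + c * np.outer(alpha, alpha)          # rank-one update of the Fisher information
        active.append(j)
        inactive.remove(j)
    w = np.zeros(m, dtype=int)
    w[active] = 1
    return w, J

rng = np.random.default_rng(4)
m, n, s = 10, 3, 4
H = rng.standard_normal((m, n))
G = rng.standard_normal((m, m))
R = G @ G.T + np.eye(m)
w, J = greedy_selection(H, R, np.eye(n), s)
```

The returned `J` is maintained purely through rank-one updates, so comparing it with a direct evaluation of (7) on the final selection is a convenient consistency check.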


In Step 2 of Algorithm 3, we search $O(m)$ sensors to achieve
the largest performance improvement. In (26), the computation
of $J_w^{-1}$ incurs a complexity of $O(n^{2.373})$ [31]. Since Algorithm 3
terminates after $s$ iterations, its overall complexity
is given by $O(sm + sn^{2.373})$, where at each iteration, the
calculation of $J_w^{-1}$ is independent of the search for the new
active sensor. If the dimension of $x$ is much less than the
number of sensors, the complexity of Algorithm 3 reduces to
$O(sm)$. Our extensive numerical experiments show that the
greedy algorithm is able to yield good locally optimal sensor
selection schemes.

IV. Special Case: Sensor Selection with Weak 
Noise Correlation 

In this section, we show that the existing sensor selection
model in [18]-[20] is invalid for an arbitrary noise covariance
matrix. We establish that in contrast to the approach proposed
in this paper, the existing model in [18]-[20] is only valid
when measurement noises are weakly correlated. In this scenario,
the proposed sensor selection problem given by (P0)
would simplify to (P1). Moreover, if the trace of the Fisher
information matrix (also known as information gain defined
in [20]) is adopted as the performance measure for sensor
selection, we show that the resulting optimization problem can
be cast as a special problem of maximizing a convex quadratic
function over a bounded polyhedron.

A. Drawbacks of existing formulation 

In [18]-[20], several variations of sensor selection problems
with correlated noise have been studied, based on whether the
quantity to be estimated is a random parameter or a random
process, and whether the cost function is energy or estimation
error. The common feature in [18]-[20] is that the information
matrix was approximated by (14); we repeat equation (14) here
for convenience

$$ \hat{J}_w = \Sigma^{-1} + H^T (w w^T \circ R^{-1}) H. \tag{29} $$

Compared to our formulation (7), the noise covariance matrix
appearing in (29) is independent of the sensor selection
variables. In fact, $\hat{J}_w$ can be thought of as Fisher information
under the measurement model

$$ y = \Phi_w^T \Phi_w H x + v, \tag{30} $$

where $\Phi_w$ was defined in (3), and $\Phi_w^T \Phi_w = \mathrm{diag}(w)$ by (4).
Different from (3), the noise
from the unselected sensors is spread across the selected
sensors. As a result, the measurement model (30) yields
$y_i = v_i$ if the $i$th sensor is inactive. This contradicts the fact
that an inactive sensor should keep silent and thus have no
effect on the estimation task.

The Fisher information in (29) can also be interpreted as
[18, Sec. 3]

$$
\hat{J}_w = \Sigma^{-1} + \sum_{i, j \in \mathcal{S}} \bar{R}_{ij} h_i h_j^T
= \Sigma^{-1} + H^T \Phi_w^T \left( \Phi_w R^{-1} \Phi_w^T \right) \Phi_w H,
\tag{31}
$$

where $\mathcal{S}$ is the set of selected sensors, and $\bar{R}_{ij}$ denotes the
$(i, j)$th entry of $R^{-1}$. In (31), $R^{-1}$ is computed first and then
truncated according to the sensor selection scheme. This is
an incorrect way of modeling the noise covariance matrix for
active sensors, since the matrix $R$ should be truncated first
and then inverted as demonstrated in (7).

Both of the interpretations (30) and (31) indicate that the
existing formulation in [18]-[20] is inaccurate for modeling
the problem of sensor selection with correlated noise. A
natural question that arises from the preceding discussion is
whether there exists a condition that ensures the validity of the
Fisher information matrix (29) as presented in [18]-[20]. We
will show in the next section that the formulation reported
in [18]-[20] becomes valid only when sensor selection is
restricted to the weak noise correlation regime.


B. Validity of existing formulation: weak correlation 

We consider the scenario of weakly correlated noise, in
which the noise covariance matrix $R$ has small off-diagonal
entries, namely, noises are weakly correlated across the sensors.
For ease of representation, we express the noise covariance
matrix as

$$ R = \Lambda + \epsilon \Upsilon, \tag{32} $$

where $\Lambda$ is a diagonal matrix which consists of the diagonal
entries of $R$, $\epsilon \Upsilon$ is a symmetric matrix whose diagonal entries
are zero and off-diagonal entries correspond to those of $R$,
the parameter $\epsilon$ is introduced to govern the strength of noise
correlation across the sensors, and $\Lambda$ and $\Upsilon$ are independent
of $\epsilon$. Clearly, the covariance of weakly correlated noises can
be described by (32) for some small value of $\epsilon$ since $\Upsilon$ is
$\epsilon$-independent. As $\epsilon \to 0$, the off-diagonal entries of $R$ are
forced to go to zero.

Proposition 2 below shows that the correct expression (7) of
Fisher information is equal to the expression (29), as presented
in [18]-[20], up to first order in $\epsilon$ as $\epsilon \to 0$.

Proposition 2: If measurement noises are weakly correlated
and $R = \Lambda + \epsilon \Upsilon$, then the Fisher information matrix (7) can
be expressed as

$$ J_w = \hat{J}_w + O(\epsilon^2) \quad \text{as } \epsilon \to 0, $$

where $\hat{J}_w$ is given by (29).

Proof: See Appendix B. ■
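
The second-order agreement stated in Proposition 2 can be observed numerically: shrinking $\epsilon$ by a factor of 10 should shrink the gap between (7) and (29) by roughly a factor of 100. The sketch below uses random $\Lambda$, $\Upsilon$ and $H$ and an identity prior information matrix, all of which are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
m, n = 6, 3
H = rng.standard_normal((m, n))
Lam = np.diag(rng.uniform(1.0, 2.0, size=m))       # diagonal part of R
U = rng.standard_normal((m, m))
Ups = (U + U.T) / 2
np.fill_diagonal(Ups, 0.0)                          # symmetric with zero diagonal
Sigma_inv = np.eye(n)
w = np.array([1, 1, 0, 1, 0, 0])
Phi = np.eye(m)[w == 1]

def gap(eps):
    """Frobenius norm of the difference between the exact and approximate Fisher information."""
    R = Lam + eps * Ups
    J_exact = Sigma_inv + H.T @ Phi.T @ np.linalg.inv(Phi @ R @ Phi.T) @ Phi @ H
    J_approx = Sigma_inv + H.T @ (np.outer(w, w) * np.linalg.inv(R)) @ H
    return np.linalg.norm(J_exact - J_approx)

for eps in (1e-1, 1e-2, 1e-3):
    print(eps, gap(eps))
```

The exact expression truncates $R$ before inversion, whereas the approximation inverts first and truncates afterwards; the two orders of operation differ only through terms that pass through unselected sensors, which first appear at order $\epsilon^2$.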

It is clear from Proposition 2 that (P1) is valid only when
the noise correlation is weak. Proceeding with the same logic
as in the previous section for the introduction of constraints
(22)-(23), we relax (P1) to the SDP

$$
\begin{array}{ll}
\underset{w, W, Z}{\text{minimize}} & \mathrm{tr}(Z) \\
\text{subject to} & \begin{bmatrix} \Sigma^{-1} + H^T (W \circ R^{-1}) H & I \\ I & Z \end{bmatrix} \succeq 0, \\
& \mathrm{tr}(W) \le s, \quad \mathrm{diag}(W) = w, \\
& \begin{bmatrix} W & w \\ w^T & 1 \end{bmatrix} \succeq 0,
\end{array}
\tag{33}
$$

where $Z \in \mathbb{S}^n$ is an auxiliary optimization variable. Given
the solution pair $(w, W)$ of problem (33), we can use the
randomization method in Algorithm 2 to construct a near-optimal
sensor selection scheme. The computational complexity
of solving problem (33) is close to that of solving
the SDP (24). However, as will be evident later, the sensor
selection problem with weakly correlated noise can be further
simplified if the trace of the Fisher information matrix is used
as the performance measure. In this scenario, the obtained
problem structure enables the use of more computationally
inexpensive algorithms, e.g., bilinear programming, to solve the
sensor selection problem.


C. Sensor selection by maximizing trace of Fisher information 


Instead of minimizing the estimation error, the trace of
Fisher information (so-called T-optimality [32]) has also been
used as a performance metric in problems of sensor selection
[20], [33], [34]. According to [35, Lemma 1], the trace of
Fisher information yields a lower bound on the trace of the
error covariance matrix given by $J_w^{-1}$ in (7). That is,

$$ \mathrm{tr}(J_w^{-1}) \ge \frac{n^2}{\mathrm{tr}(J_w)}. \tag{34} $$

Motivated by (34) and the generalized information gain used in
[20], we propose to minimize the lower bound of the objective
function in (P1), which leads to the problem

$$
\begin{array}{ll}
\underset{w}{\text{maximize}} & \mathrm{tr}\left( \Sigma^{-1} + H^T (w w^T \circ R^{-1}) H \right) \\
\text{subject to} & \mathbf{1}^T w \le s, \\
& w \in \{0,1\}^m.
\end{array}
\tag{P2}
$$

It is worth mentioning that the sensor selection scheme
obtained from (P2) may not be optimal in the MMSE sense.
However, the trace operator is linear and introduces computational
benefits in optimization. Reference [20] has shown
that (P2) is not convex even if Boolean selection variables
are relaxed. However, there is no theoretical justification and
analysis provided in [20] on the problem structure. In what
follows, we demonstrate that the Boolean constraint in (P2)
can be replaced by its convex hull $w \in [0,1]^m$ without loss
of performance, to obtain an equivalent optimization problem.

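Proposition 3: (P2) is equivalent to

$$\begin{array}{ll}
\underset{\mathbf{w}}{\text{maximize}} & \mathbf{w}^T \boldsymbol{\Omega} \mathbf{w} \\
\text{subject to} & \mathbf{1}^T\mathbf{w} \le s, \\
& \mathbf{w} \in [0,1]^m,
\end{array} \tag{35}$$

where $\boldsymbol{\Omega}$ is a positive semidefinite matrix given by
$\mathbf{A}(\mathbf{R}^{-1} \otimes \mathbf{I}_n)\mathbf{A}^T$, $\otimes$ denotes the Kronecker product,
$\mathbf{A} \in \mathbb{R}^{m \times mn}$ is a block-diagonal matrix whose diagonal blocks
are given by $\{\mathbf{h}_i^T\}_{i=1}^m$, and $\mathbf{h}_i^T$ denotes the $i$th row of the
measurement matrix $\mathbf{H}$.

Proof: See Appendix C. ■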
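The identity behind Proposition 3 can be checked numerically. The following is a minimal sketch, not the authors' code: it builds $\boldsymbol{\Omega}$ from the Kronecker form and compares $\mathbf{w}^T\boldsymbol{\Omega}\mathbf{w}$ with the measurement part of the objective of (P2). The function name `build_omega`, the problem sizes, and the random test data are assumptions made for illustration.

```python
import numpy as np

def build_omega(H, R):
    """Form Omega = A (R^{-1} kron I_n) A^T with A = blkdiag(h_1^T, ..., h_m^T)."""
    m, n = H.shape
    A = np.zeros((m, m * n))
    for i in range(m):
        A[i, i * n:(i + 1) * n] = H[i]          # i-th diagonal block is h_i^T
    return A @ np.kron(np.linalg.inv(R), np.eye(n)) @ A.T

# Hypothetical test data: 6 sensors, 2 parameters.
rng = np.random.default_rng(0)
m, n = 6, 2
H = rng.standard_normal((m, n))
B = rng.standard_normal((m, m))
R = B @ B.T + m * np.eye(m)                     # generic positive definite covariance
w = rng.integers(0, 2, size=m).astype(float)    # a Boolean selection vector

Omega = build_omega(H, R)
lhs = np.trace(H.T @ (np.outer(w, w) * np.linalg.inv(R)) @ H)
rhs = w @ Omega @ w
```

Entrywise, $\boldsymbol{\Omega}$ equals the Hadamard product of $\mathbf{R}^{-1}$ and $\mathbf{H}\mathbf{H}^T$, which gives a cheaper way to form it than the Kronecker product when $m$ is large.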

It is clear from Proposition 3 that (P2) amounts to
the problem of maximizing a convex quadratic function over
a bounded polyhedron. It is known that finding a globally
optimal solution of (35) is NP-hard. Therefore, we resort to
local optimization methods, such as bilinear programming
and SDR, to solve problem (35). To be specific, bilinear
programming is a special case of alternating convex optimization,
where at each iteration we solve two linear programs.
Since bilinear programming is based on linear programming,
it scales gracefully with problem size, but it may find
only local optima. If we rewrite the constraints of
problem (35) as quadratic forms in $\mathbf{w}$, (35) can be further
transformed into a nonconvex homogeneous quadratically constrained
quadratic program (QCQP), i.e., a QCQP
without linear terms in the optimization variables. In this
scenario, SDR can be applied to solve the problem. Compared
to the application of SDR in (33), the homogeneous QCQP
leads to an SDP with a smaller problem size. We refer the
readers to [22, Sec. V] and [20, Sec. V] for more details on
the application of bilinear programming and SDR.
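As a rough illustration of the alternating idea, the sketch below maximizes the bilinear form $\mathbf{u}^T\boldsymbol{\Omega}\mathbf{v}$ by solving one linear program in $\mathbf{u}$ and one in $\mathbf{v}$ per iteration. It is not necessarily the procedure of the cited references. It assumes an integer budget $s$, a feasible starting point `w0`, and a positive semidefinite `Omega`; the function names are hypothetical. For the constraint set $\{\mathbf{1}^T\mathbf{u} \le s,\ \mathbf{u} \in [0,1]^m\}$ each linear program has a closed-form vertex solution, so no LP solver is needed.

```python
import numpy as np

def lp_over_budget_box(c, s):
    """Solve max c^T u  s.t.  1^T u <= s, 0 <= u <= 1 (integer s) in closed form."""
    u = np.zeros_like(c, dtype=float)
    top = np.argsort(-c)[:s]              # indices of the s largest coefficients
    u[top[c[top] > 0]] = 1.0              # activate only entries with positive gain
    return u

def alternating_lp(Omega, s, w0, max_iter=100):
    """Alternate two LPs on u^T Omega v; returns a Boolean local solution."""
    u, v = w0.astype(float), w0.astype(float)
    for _ in range(max_iter):
        u_new = lp_over_budget_box(Omega @ v, s)       # LP in u for fixed v
        v_new = lp_over_budget_box(Omega @ u_new, s)   # LP in v for fixed u
        converged = np.array_equal(u_new, u) and np.array_equal(v_new, v)
        u, v = u_new, v_new
        if converged:
            break
    # For PSD Omega, the better of u and v is at least as good as the pair (u, v).
    return u if u @ Omega @ u >= v @ Omega @ v else v
```

The returned vector depends on the starting point, which reflects the remark above that only local optima are guaranteed.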

V. Non-myopic Sensor Scheduling 

In this section, we extend the sensor selection framework 
with correlated noise to the problem of non-myopic sensor 
scheduling, which determines sensor activations for multiple 
future time steps. Since the Fisher information matrices at 
consecutive time steps are coupled with each other, expressing 
them in a closed form with respect to the sensor selection 
variables becomes intractable. Therefore, we employ a greedy 
algorithm to seek locally optimal solutions of the non-myopic 
sensor scheduling problem. 

Consider a discrete-time dynamical system 

$$\mathbf{x}_{t+1} = \mathbf{F}_t \mathbf{x}_t + \mathbf{u}_t \tag{36}$$

$$\mathbf{y}_t = \mathbf{H}_t \mathbf{x}_t + \mathbf{v}_t, \tag{37}$$

where $\mathbf{x}_t \in \mathbb{R}^n$ is the target state at time $t$, $\mathbf{y}_t \in \mathbb{R}^m$
is the measurement vector whose $i$th entry corresponds to
a scalar observation from the $i$th sensor at time $t$, $\mathbf{F}_t$ is
the state transition matrix from time $t$ to time $t + 1$, and
$\mathbf{H}_t$ denotes the observation matrix at time $t$. The inputs $\mathbf{u}_t$
and $\mathbf{v}_t$ are white, Gaussian, zero-mean random vectors with
covariance matrices $\mathbf{Q}$ and $\mathbf{R}$, respectively. We note that the
covariance matrix $\mathbf{R}$ may not be diagonal, since the noises
experienced by different sensors could be spatially correlated.
We also remark that although the dynamical system (36)-(37)
is assumed to be linear, it will be evident later that the
proposed sensor scheduling framework is also applicable to
nonlinear dynamical systems.

The PDF of the initial state $\mathbf{x}_0$ at time step $t_0$ is assumed
to be Gaussian with mean $\hat{\mathbf{x}}_0$ and covariance matrix $\hat{\mathbf{P}}_0$,
where $\hat{\mathbf{x}}_0$ and $\hat{\mathbf{P}}_0$ are estimates of the initial state and error
covariance from the previous measurements obtained using
filtering algorithms, such as a particle filter or a Kalman filter.
At time step $t_0$, we aim to find the optimal sensor
schedule over the next $\tau$ time steps $t_0 + 1, t_0 + 2, \ldots, t_0 + \tau$.

Hereafter, for notational simplicity, we assume $t_0 = 0$. The
sensor schedule can be represented by a vector of binary
variables

$$\mathbf{w} = [\mathbf{w}_1^T, \mathbf{w}_2^T, \ldots, \mathbf{w}_\tau^T]^T \in \{0,1\}^{\tau m}, \tag{38}$$

where $\mathbf{w}_t = [w_{t,1}, w_{t,2}, \ldots, w_{t,m}]^T$ characterizes the sensor
schedule at time $1 \le t \le \tau$. In what follows, we assume that
$\tau > 1$. If $\tau = 1$, the non-myopic sensor scheduling problem
reduces to the sensor selection problem for one snapshot, or
the so-called myopic scheduling problem. This case has been
studied in the previous sections.

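In the context of state tracking, the Fisher
information matrix has the following recursive form

$$\mathbf{J}_t = (\mathbf{Q} + \mathbf{F}_{t-1}\mathbf{J}_{t-1}^{-1}\mathbf{F}_{t-1}^T)^{-1} + \mathbf{G}_t \tag{39}$$

$$\mathbf{G}_t = \mathbf{H}_t^T \boldsymbol{\Phi}_{w_t}^T (\boldsymbol{\Phi}_{w_t} \mathbf{R} \boldsymbol{\Phi}_{w_t}^T)^{-1} \boldsymbol{\Phi}_{w_t} \mathbf{H}_t \tag{40}$$

for $t = 1, 2, \ldots, \tau$, where $\mathbf{J}_t$ denotes the Fisher information
at time $t$, $\mathbf{G}_t$ denotes the part of the Fisher information matrix
which incorporates the updated measurement, and $\boldsymbol{\Phi}_{w_t}$ is a
submatrix of $\operatorname{diag}(\mathbf{w}_t)$ where all the rows corresponding to
the unselected sensors are removed. It is clear from (10) that
the term involving $\boldsymbol{\Phi}_{w_t}$ in (40) can be further expressed in an
explicit form with respect to $\mathbf{w}_t$.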
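The recursion (39)-(40) is straightforward to evaluate for a given schedule. The sketch below is illustrative only and assumes time-invariant matrices `F` and `H` for brevity; the function names and the row-per-time-step layout of `schedule` are choices made here, not taken from the paper. Row selection by index plays the role of multiplying with the selection matrix.

```python
import numpy as np

def measurement_information(H, R, w_t):
    """G_t = H^T Phi^T (Phi R Phi^T)^{-1} Phi H for a 0/1 activation vector w_t."""
    idx = np.flatnonzero(w_t)
    n = H.shape[1]
    if idx.size == 0:
        return np.zeros((n, n))              # no active sensor, no new information
    H_sel = H[idx]                           # Phi H: rows of the selected sensors
    R_sel = R[np.ix_(idx, idx)]              # Phi R Phi^T: their noise covariance
    return H_sel.T @ np.linalg.solve(R_sel, H_sel)

def fisher_recursion(J0, F, Q, H, R, schedule):
    """Evaluate J_1, ..., J_tau; `schedule` holds one activation vector per row."""
    J, history = J0, []
    for w_t in schedule:
        predicted = np.linalg.inv(Q + F @ np.linalg.inv(J) @ F.T)
        J = predicted + measurement_information(H, R, w_t)
        history.append(J)
    return history
```

Because each `J` feeds into the next prediction step, changing the activation at one time step changes every later Fisher information matrix.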

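Remark 1: In case of nonlinear measurement models, the
term $\mathbf{G}_t$ in the Fisher information matrix becomes

$$\mathbf{G}_t = \mathbb{E}_{\mathbf{x}_t}\left[ (\nabla_{\mathbf{x}_t}^T \mathbf{h})^T \boldsymbol{\Phi}_{w_t}^T (\boldsymbol{\Phi}_{w_t} \mathbf{R} \boldsymbol{\Phi}_{w_t}^T)^{-1} \boldsymbol{\Phi}_{w_t} (\nabla_{\mathbf{x}_t}^T \mathbf{h}) \right],$$

where $\mathbf{h}(\cdot)$ is a nonlinear measurement function, and $\nabla_{\mathbf{x}_t}^T \mathbf{h}$ is
the Jacobian matrix of $\mathbf{h}$ with respect to $\mathbf{x}_t$. In this equation,
the expectation with respect to $\mathbf{x}_t$ is commonly calculated with
the help of the prediction state $\hat{\mathbf{x}}_t := \mathbf{F}_{t-1}\mathbf{F}_{t-2} \cdots \mathbf{F}_0 \hat{\mathbf{x}}_0$.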


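To be concrete, we approximate the PDF of $\mathbf{x}_t$ with
$p(\mathbf{x}_t) = \delta(\mathbf{x}_t - \hat{\mathbf{x}}_t)$, where $\delta(\cdot)$ is a $\delta$-function. The matrix $\mathbf{G}_t$
is then given by

$$\mathbf{G}_t = \mathbf{H}_t^T \boldsymbol{\Phi}_{w_t}^T (\boldsymbol{\Phi}_{w_t} \mathbf{R} \boldsymbol{\Phi}_{w_t}^T)^{-1} \boldsymbol{\Phi}_{w_t} \mathbf{H}_t, \tag{41}$$

where $\mathbf{H}_t := \nabla_{\mathbf{x}_t}^T \mathbf{h}(\hat{\mathbf{x}}_t)$.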

We note that the Fisher information matrices at consecutive
time steps are coupled with each other due to the recursive
structure in (39). Therefore, $\mathbf{J}_t$ is a function of all selection
variables $\{\mathbf{w}_k\}_{k=1}^{t}$. The recursive structure makes the closed
form of Fisher information intractable. This is in sharp contrast
with the problem of myopic sensor selection, where expressing
the Fisher information matrix in a closed form is possible.

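We now pose the non-myopic sensor scheduling problem

$$\begin{array}{lll}
\underset{\mathbf{w}}{\text{minimize}} & \dfrac{1}{\tau} \displaystyle\sum_{t=1}^{\tau} \operatorname{tr}(\mathbf{J}_t^{-1}) & \\
\text{subject to} & \mathbf{1}^T\mathbf{w} \le s, & \text{(42a)} \\
& \displaystyle\sum_{t=1}^{\tau} w_{t,i} \le s_i, \quad i = 1, 2, \ldots, m, & \text{(42b)} \\
& \mathbf{w} \in \{0,1\}^{\tau m}, &
\end{array}$$

where $\mathbf{J}_t$ is determined by (39)-(40), the cumulative energy
constraint (42a) restricts the total number of activations for all
sensors over the entire time horizon, and the individual energy
constraint (42b) implies that the $i$th sensor can report at most
$s_i$ measurements over $\tau$ time steps.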

To solve problem (42) in a numerically efficient manner,
we employ a greedy algorithm that iteratively activates one
sensor at a time until the energy constraints are satisfied with
equality. The proposed greedy algorithm can be viewed as a
generalization of Algorithm 3 by incorporating the length of
the time horizon and individual energy constraints.

We elaborate on the greedy algorithm. In the initial step,
we assume $\mathbf{w} = \mathbf{0}$ and split the set of indices of $\mathbf{w}$ into $m$
subsets $\{\mathcal{I}_i\}_{i=1}^m$, where we use the entries of the set $\mathcal{I}_i$ to keep
track of all the time instants at which the $i$th sensor is inactive.
The set $\mathcal{I}_i$ is initially given by $\{i, i + m, \ldots, i + (\tau - 1)m\}$
for $i = 1, 2, \ldots, m$. There exists a one-to-one correspondence
between an index $j \in \mathcal{I}_i$ and a time instant $t \in \{1, 2, \ldots, \tau\}$
at which the $i$th sensor can be scheduled, where $j = i + (t - 1)m$.
At every iteration of the greedy optimization algorithm,
we update $\mathcal{I}_i$ for $i = 1, 2, \ldots, m$ such that it only contains
indices of zero entries of $\mathbf{w}$. The quantity $\tau - |\mathcal{I}_i|$ gives the
number of times that the $i$th sensor has been used, where $|\cdot|$
denotes the cardinality of a set. The condition $\tau - |\mathcal{I}_i| \ge s_i$
indicates that the $i$th sensor has used up its individual energy
budget, so that any further activation would violate (42b). Note
that the union $\mathcal{I}_1 \cup \mathcal{I}_2 \cup \ldots \cup \mathcal{I}_m$ gives all the remaining time
instants at which the sensors can be activated. We enumerate
all the indices in the union to determine the index $j^*$ such
that the objective function of (42) is minimized when $w_{j^*} = 1$.
We summarize the greedy algorithm for non-myopic sensor
scheduling in Algorithm 4.

Algorithm 4 Greedy algorithm for sensor scheduling

Require: $\mathbf{w} = \mathbf{0}$ and $\mathcal{I}_i = \{i, i + m, \ldots, i + (\tau - 1)m\}$ for
$i = 1, 2, \ldots, m$

1: for $l = 1, 2, \ldots, \min\{s, \sum_{i=1}^m s_i\}$ do

2: if $\tau - |\mathcal{I}_i| \ge s_i$, then replace $\mathcal{I}_i$ with an empty set for
$i = 1, 2, \ldots, m$,

3: enumerate indices of $\mathbf{w}$ in $\mathcal{I}_1 \cup \mathcal{I}_2 \cup \ldots \cup \mathcal{I}_m$ to
select $j^*$ such that the objective function of (42) is
minimized when $w_{j^*} = 1$,

4: remove $j^*$ from $\mathcal{I}_{i^*}$, where $i^*$ is given by the remainder
of $j^*/m$ for $i^* \neq m$, and $i^* = m$ if the remainder is 0.

5: end for
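The steps of Algorithm 4 can be sketched in a few dozen lines. The code below is a minimal illustration rather than the authors' implementation: it assumes time-invariant `F` and `H`, uses 0-based indexing (so that flat index `j = i + t*m` belongs to sensor `j % m`), and re-evaluates the full objective of (42) for every candidate. All function and variable names are chosen here for illustration.

```python
import numpy as np

def avg_trace_inv_fisher(w, J0, F, Q, H, R, tau):
    """Objective of (42): (1/tau) * sum_t tr(J_t^{-1}) with J_t from (39)-(40)."""
    m = H.shape[0]
    J, total = J0, 0.0
    for t in range(tau):
        idx = np.flatnonzero(w[t * m:(t + 1) * m])     # sensors active at step t
        J = np.linalg.inv(Q + F @ np.linalg.inv(J) @ F.T)
        if idx.size:
            Hs = H[idx]
            J = J + Hs.T @ np.linalg.solve(R[np.ix_(idx, idx)], Hs)
        total += np.trace(np.linalg.inv(J))
    return total / tau

def greedy_schedule(J0, F, Q, H, R, tau, s, s_ind):
    """Greedy activation under a total budget s and per-sensor budgets s_ind."""
    m = H.shape[0]
    w = np.zeros(tau * m)
    # inactive[i]: flat indices at which sensor i is still switched off
    inactive = [set(range(i, tau * m, m)) for i in range(m)]
    for _ in range(min(s, int(np.sum(s_ind)))):
        for i in range(m):                             # Step 2: retire exhausted sensors
            if tau - len(inactive[i]) >= s_ind[i]:
                inactive[i] = set()
        candidates = sorted(set().union(*inactive))
        if not candidates:
            break
        best_j, best_val = None, np.inf
        for j in candidates:                           # Step 3: try each remaining index
            w[j] = 1.0
            val = avg_trace_inv_fisher(w, J0, F, Q, H, R, tau)
            w[j] = 0.0
            if val < best_val:
                best_j, best_val = j, val
        w[best_j] = 1.0
        inactive[best_j % m].discard(best_j)           # Step 4: update the index set
    return w.reshape(tau, m)                           # row t is the schedule at step t
```

Each outer iteration costs one objective evaluation per remaining index, which is the enumeration that dominates the running time of the algorithm.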


The computational complexity of Algorithm 4 is dominated
by Step 3. Specifically, we evaluate the objective function
of (42) using $O(\tau m)$ operations. And the computation
of the Fisher information matrix requires a complexity of 
where 0(t) accounts for the number of recur¬ 
sions, and is the complexity of matrix inversion 

in (41) [31]. We emphasize that, unlike in Proposition 2,
expressing the closed form of the performance improvement
in a greedy manner becomes intractable, since the Fisher
information matrices are coupled with each other over the
time horizon. Therefore, the computation cost of Algorithm 4
is given by per iteration. 

For additional perspective, we compare the computational
complexity of Algorithm 4 with the method in [21], where a
reweighted $\ell_1$ based quadratic programming (QP) was used
to obtain locally optimal sensor schedules under linear (or
linearized) dynamical systems with correlated noise. It was
shown in that the computational complexity of QP was 
ideally given by 0(w?'^t^) for every reweighting iteration. 
We note that the computational complexity of the greedy 
algorithm increases slightly in terms of the network size by a 
factor while it decreases significantly in terms of the 

length of the time horizon by a factor r^. 


VI. Numerical Results 


In this section, we demonstrate the effectiveness of the 
proposed approach for sensor selection/scheduling with correlated
measurement noise. In our numerical examples, we
assume that the sensors are randomly deployed in a square 
region, where each of them provides the measurement of an 
unknown parameter or state. For parameter estimation, we 
use the linear MMSE estimator Sec. 12] to estimate the 
unknown parameter. For state tracking, we use the extended 
Kalman filter p?} Sec. 13] to track the target state. 

Sensor selection for parameter estimation: We consider a
network with $m \in \{20, 50\}$ sensors to estimate the vector of
parameters $\mathbf{x} \in \mathbb{R}^n$ with $n = 2$, where sensors are randomly
deployed over a $50 \times 50$ lattice. The prior PDF of $\mathbf{x}$ is given
by $\mathbf{x} \sim \mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Sigma})$, where $\boldsymbol{\mu} = [10, 10]^T$ and $\boldsymbol{\Sigma} = \mathbf{I}$. For
simplicity, the row vectors of the measurement matrix $\mathbf{H}$
are chosen randomly, and independently, from the distribution
$\mathcal{N}(\mathbf{0}, \mathbf{I}/\sqrt{n})$ [13]. The covariance matrix of the measurement
noise is set by an exponential model [21]

$$R_{ij} = \operatorname{cov}(v_i, v_j) = \sigma_v^2 e^{-\rho \|\boldsymbol{\beta}_i - \boldsymbol{\beta}_j\|_2} \tag{43}$$

for $i, j = 1, 2, \ldots, m$, where $\sigma_v^2 = 1$, $\boldsymbol{\beta}_i \in \mathbb{R}^2$ is the location
of the $i$th sensor in the 2D plane, $\|\cdot\|_2$ denotes the Euclidean
norm, and $\rho$ is the correlation parameter which governs the
strength of spatial correlation, namely, a larger (or smaller) $\rho$
corresponds to a weaker (or stronger) correlation.
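For reference, an exponential covariance of this kind can be generated as follows. This is a small sketch under the assumption that sensor locations are stored as the rows of an array; the function name and argument names are illustrative.

```python
import numpy as np

def exponential_covariance(locations, rho, sigma2=1.0):
    """R_ij = sigma2 * exp(-rho * ||beta_i - beta_j||_2) for sensor locations beta_i."""
    diff = locations[:, None, :] - locations[None, :, :]   # pairwise differences
    dist = np.linalg.norm(diff, axis=-1)                    # pairwise Euclidean distances
    return sigma2 * np.exp(-rho * dist)

# Hypothetical deployment: 20 sensors placed uniformly at random in a 50 x 50 region.
rng = np.random.default_rng(0)
beta = rng.uniform(0.0, 50.0, size=(20, 2))
R_strong = exponential_covariance(beta, rho=0.05)   # smaller rho: stronger correlation
R_weak = exponential_covariance(beta, rho=0.5)      # larger rho: weaker correlation
```

With a small correlation parameter the off-diagonal entries decay slowly with distance, so even sensors that are far apart have noticeably correlated noise.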

We choose $N = 100$ while performing the randomization
method. Also, we employ an exhaustive search that enumerates
all possible sensor selection schemes to obtain the globally
optimal solution of (P0). The estimation performance is measured
through the empirical MSE, which is averaged over 1000
numerical trials.

In Fig. 1, we present the MSE as a function of the energy
budget by solving (P0) with correlation parameter $\rho = 0.1$. In
Fig. 1(a), for the tractability of exhaustive search, we consider
a small network with $m = 20$ sensors. We compare the
performance of the proposed greedy algorithm and SDR with
randomization to that of SDR without randomization and exhaustive
search. In particular, the right plots of Fig. 1(a) show
the performance gaps for the obtained locally optimal solutions
compared to the globally optimal solutions resulting from an
exhaustive search. We observe that the SDR method with
randomization outperforms the greedy algorithm and yields
optimal solutions. The randomization method also significantly
improves the performance of SDR in sensor selection. This is
not surprising, and our numerical observations agree with the
literature [27] that demonstrates the power and utility of
randomization in SDR.

In Fig. 1(b), we present the MSE as a function of the energy
budget for a relatively large network ($m = 50$). Similar to
the results of Fig. 1(a), the SDR method with randomization
yields the lowest estimation error. We also observe that the
MSE ceases to decrease significantly when $s > 20$. This
indicates that a subset of sensors suffices to provide satisfactory
estimation performance, since the presence of correlation
among sensors introduces information redundancy and makes
observations less diverse.

Fig. 1: MSE versus energy budget with correlation parameter $\rho = 0.1$.

In Fig. 2, we solve the problem of sensor selection with
weak noise correlation ($\rho = 0.5$), and present the MSE as a
function of the energy budget $s \in \{2, 3, \ldots, 50\}$. We compare
the performance of three optimization approaches: SDR with
randomization for solving (P1), bilinear programming (BP)
for solving (P2), and SDR with randomization for solving
(P2). We recall that (P1) is to minimize the trace of the
error covariance matrix and (P2) is to maximize the trace of
Fisher information. As we can see, approaches that maximize
the trace of Fisher information yield worse estimation performance
than those that minimize the estimation error. This is
because (P2) ignores the contribution of prior information $\boldsymbol{\Sigma}$
in sensor selection. We also note that although BP (a linear
programming based approach) has the lowest computational
complexity, it leads to the worst optimization performance.



Fig. 2: MSE versus energy budget for sensor selection with weak noise
correlation $\rho = 0.5$.

Fig. 3: MSE versus the strength of correlation for $s \in \{7, 13\}$.


In Fig. 3, we present the MSE as a function of the correlation
parameter $\rho$, where $m = 50$ and $s \in \{7, 13\}$. We consider
sensor selection schemes by using SDR with randomization
to solve problems (P0) and (P1), respectively. For comparison,
we also present the estimation performance when all the sensors
are selected. As demonstrated in Fig. 3, we consider two
correlation regimes: weak correlation and strong correlation.
We observe that in the weak correlation regime, solutions of
both (P0) and (P1) yield the same estimation performance. In
the strong correlation regime, solutions of (P1) could lead to
worse estimation performance for sensor selection. We also
observe that the sensitivity to the strategy of sensor selection
reduces if the strength of correlation becomes extremely large,
e.g., $\rho < 0.05$. More interestingly, the estimation performance
is improved as the correlation becomes stronger. This is
because for strongly correlated noise, noise cancellation could
be achieved by subtracting one observation from the other [43].
Further, if we fix the value of $\rho$, the estimation error decreases
when the energy budget increases, and the performance gap
between solutions of (P0) and (P1) reduces.
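The noise cancellation effect can be seen in a toy example that is not taken from the paper. Assume a scalar parameter, two sensors with unit noise variance and noise correlation coefficient `r`, and hypothetical gains `h = [1, 0]`, so that the second sensor observes noise only. The Fisher information $\mathbf{h}^T\mathbf{R}^{-1}\mathbf{h}$ then equals $1/(1 - r^2)$, which grows as the correlation strengthens because the second observation can be used to subtract part of the noise from the first.

```python
import numpy as np

def two_sensor_information(h, r, sigma2=1.0):
    """Fisher information h^T R^{-1} h for two sensors with noise correlation r."""
    R = sigma2 * np.array([[1.0, r], [r, 1.0]])
    return float(h @ np.linalg.solve(R, h))

h = np.array([1.0, 0.0])     # hypothetical gains: sensor 2 measures noise only
correlations = [0.0, 0.5, 0.9, 0.99]
information = [two_sensor_information(h, r) for r in correlations]
```

The behavior depends on the gains: for other choices of `h` the information need not increase monotonically with `r`, so this example illustrates the mechanism rather than a general rule.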


Sensor scheduling for state tracking 


In this example, we track a target with $m = 30$ sensors over
30 time steps. We assume that the target state is a $4 \times 1$ vector
$\mathbf{x}_t = [x_{t,1}, x_{t,2}, x_{t,3}, x_{t,4}]^T$, where $(x_{t,1}, x_{t,2})$ and $(x_{t,3}, x_{t,4})$
denote the target location and velocity at time step $t$. The state
equation (36) follows a white noise acceleration model,
where $\Delta$ and $q$ denote the sampling interval and the process
noise parameter, respectively. In our simulations, we set $\Delta = 1$
and $q = 0.01$. The prior PDF of the initial state is assumed to
be Gaussian with mean $\hat{\mathbf{x}}_0 = [1, 1, 0.5, 0.5]^T$ and covariance
$\boldsymbol{\Sigma}_0 = \operatorname{diag}(1, 1, 0.1, 0.1)$. The measurement equation follows
a power attenuation model,


$$h_i(\mathbf{x}_t) = \sqrt{\frac{P_0}{1 + (x_{t,1} - \beta_{i,1})^2 + (x_{t,2} - \beta_{i,2})^2}} \tag{44}$$


for i = 1, 2,..., TO, where Pq = 10^ is the signal power of the 
source, and the pair $(\beta_{i,1}, \beta_{i,2})$ is the position of the $i$th sensor.
The covariance matrix of the measurement noise is given by
(43) with $\rho = 0.035$.
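The tracking setup can be sketched in code under two explicit assumptions. First, the state transition and process noise matrices below follow a commonly used discretization of the white noise acceleration model for the state ordering (position, position, velocity, velocity); the exact matrices used in the paper are not reproduced here. Second, the measurement function is taken in the amplitude form $\sqrt{P_0 / (1 + d_i^2)}$, with $d_i$ the distance between target and sensor $i$. Function names and the value of `P0` in any usage are hypothetical.

```python
import numpy as np

def white_noise_acceleration(delta, q):
    """F and Q for state [x, y, vx, vy] under a common discretization (assumed)."""
    F = np.array([[1, 0, delta, 0],
                  [0, 1, 0, delta],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1]], dtype=float)
    Q = q * np.array([[delta**3 / 3, 0, delta**2 / 2, 0],
                      [0, delta**3 / 3, 0, delta**2 / 2],
                      [delta**2 / 2, 0, delta, 0],
                      [0, delta**2 / 2, 0, delta]])
    return F, Q

def attenuation(x, beta, P0):
    """h_i(x) = sqrt(P0 / (1 + squared distance to sensor i)), one entry per sensor."""
    d2 = np.sum((x[:2] - beta) ** 2, axis=1)
    return np.sqrt(P0 / (1.0 + d2))

def attenuation_jacobian(x, beta, P0):
    """m x 4 Jacobian of the attenuation model; the velocity columns are zero."""
    diff = x[:2] - beta
    d2 = np.sum(diff ** 2, axis=1)
    J_pos = -np.sqrt(P0) * diff / (1.0 + d2)[:, None] ** 1.5
    return np.hstack([J_pos, np.zeros_like(J_pos)])
```

Evaluating `attenuation_jacobian` at the prediction state gives the linearized observation matrix that a Fisher information recursion or an extended Kalman filter would use.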



In the sensor scheduling problem (42), we assume $s = \sum_{i=1}^{m} s_i$
and $s_1 = s_2 = \cdots = s_m$. In order to implement
the proposed greedy algorithm and the existing method in
[21], the nonlinear measurement function (44) is linearized
at the prediction state $\hat{\mathbf{x}}_t = \mathbf{F}_{t-1}\mathbf{F}_{t-2} \cdots \mathbf{F}_0 \hat{\mathbf{x}}_0$ as suggested
in Remark 1. We determine sensor schedules for every $\tau = 6$
future time steps, and then update the estimate of the target
state based on the selected measurements via an extended
Kalman filter. The estimation performance is measured
through the empirical MSE, which is obtained by averaging
the estimation error over 30 time steps and 1000 simulation
trials.

Fig. 5: Sensor schedules when $s_i = 2$: (a) $t = 10$, (b) $t = 24$.

In Fig. 4, we present the MSE as a function of the individual
energy budget. We compare the performance of our
proposed greedy algorithm with that of the sensor scheduling
method in [21]. We remark that the method in [21] relies on
a reformulation of linearized dynamical systems and an $\ell_1$
relaxation in optimization. In contrast, the proposed greedy
algorithm is independent of the dynamical system models
and convex relaxations. We observe that the greedy algorithm
outperforms the method in [21]. This result together with the
previous results in Eig.[Dand|^ have implied that the greedy 
algorithm could yield satisfactory estimation performance. 
Sensor schedules at time steps $t = 10$ and $24$ are shown
in Fig. 5. We observe that some sensors closest to the target
are selected due to their high signal power. However, from
the entire network point of view, the active sensors tend to
be spatially distributed rather than aggregating in a small
neighborhood around the target. This is because observations
from neighboring sensors are strongly correlated in space and
may lead to information redundancy in target tracking.

VII. Conclusion 

In this paper, we studied the problem of sensor selection/scheduling
with correlated measurement noise. We proposed
a general but tractable framework to design optimal
sensor activations. We pointed out some drawbacks of the
existing frameworks for sensor selection with correlated noise,
and showed that the existing formulation is valid only for the
special case of weak noise correlation. Further, we extended
our framework to the problem of non-myopic sensor scheduling,
where a greedy algorithm was developed to design non-myopic
sensor schedules. Numerical results were provided to
illustrate the effectiveness of our approach and the impact of
noise correlation on the performance of sensor selection.


In future work, we will study applications of sensor selection
with correlated noise, such as localization in multipath
environments, sensor collaboration in distributed estimation, 
and clock synchronization in wireless sensor networks. It 
would also be of interest to seek theoretical guarantees for 
the performance of the greedy algorithm. Furthermore, in 
order to reduce the computational burden at the fusion center, 
developing a decentralized architecture where the optimization 
procedure can be carried out in a distributed way and by the 
sensors themselves is another direction of future research. 


Appendix A 

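Proof of Proposition 1

Given the sensor selection scheme $\mathbf{w}$, it is clear from (7)
that the Fisher information after additionally activating the
$j$th sensor can be written as

$$\mathbf{J}_{\tilde{w}} = \boldsymbol{\Sigma}^{-1} + [\mathbf{H}_w^T, \mathbf{h}_j]\, \mathbf{R}_{\tilde{w}}^{-1} \begin{bmatrix} \mathbf{H}_w \\ \mathbf{h}_j^T \end{bmatrix}, \quad \mathbf{R}_{\tilde{w}} = \begin{bmatrix} \mathbf{R}_w & \mathbf{r}_j \\ \mathbf{r}_j^T & r_{jj} \end{bmatrix}, \tag{45}$$

where $\mathbf{H}_w := \boldsymbol{\Phi}_w \mathbf{H}$.

If $\mathbf{w} \neq \mathbf{0}$, the inverse of $\mathbf{R}_{\tilde{w}}$ in (45) is given by

$$\mathbf{R}_{\tilde{w}}^{-1} = \begin{bmatrix} \mathbf{R}_w^{-1} + c_j \mathbf{R}_w^{-1}\mathbf{r}_j\mathbf{r}_j^T\mathbf{R}_w^{-1} & -c_j \mathbf{R}_w^{-1}\mathbf{r}_j \\ -c_j \mathbf{r}_j^T\mathbf{R}_w^{-1} & c_j \end{bmatrix}, \tag{46}$$

where $c_j := 1/(r_{jj} - \mathbf{r}_j^T\mathbf{R}_w^{-1}\mathbf{r}_j)$, and $c_j > 0$ following from
the Schur complement of $\mathbf{R}_{\tilde{w}}$. Substituting (46) into (45), we
obtain

$$\mathbf{J}_{\tilde{w}} = \mathbf{J}_w + c_j \boldsymbol{\alpha}_j \boldsymbol{\alpha}_j^T, \tag{47}$$

where $\mathbf{J}_w = \boldsymbol{\Sigma}^{-1} + \mathbf{H}_w^T\mathbf{R}_w^{-1}\mathbf{H}_w$, as indicated by (7), and
$\boldsymbol{\alpha}_j := \mathbf{H}_w^T\mathbf{R}_w^{-1}\mathbf{r}_j - \mathbf{h}_j$.

If $\mathbf{w} = \mathbf{0}$, namely, $\mathbf{J}_w = \boldsymbol{\Sigma}^{-1}$, we can immediately obtain
from (45) that

$$\mathbf{J}_{\tilde{w}} = \mathbf{J}_w + \frac{1}{r_{jj}} \mathbf{h}_j \mathbf{h}_j^T. \tag{48}$$

Equations (47) and (48) imply that $\mathbf{J}_{\tilde{w}} - \mathbf{J}_w \succeq 0$ since $c_j > 0$.

We apply the matrix inversion lemma to (47). This yields

$$\mathbf{J}_{\tilde{w}}^{-1} = [\mathbf{J}_w + c_j \boldsymbol{\alpha}_j \boldsymbol{\alpha}_j^T]^{-1} = \mathbf{J}_w^{-1} - \frac{c_j \mathbf{J}_w^{-1}\boldsymbol{\alpha}_j\boldsymbol{\alpha}_j^T\mathbf{J}_w^{-1}}{1 + c_j \boldsymbol{\alpha}_j^T\mathbf{J}_w^{-1}\boldsymbol{\alpha}_j}.$$

The improvement in estimation error is then given by

$$\operatorname{tr}(\mathbf{J}_w^{-1}) - \operatorname{tr}(\mathbf{J}_{\tilde{w}}^{-1}) = \frac{c_j \boldsymbol{\alpha}_j^T\mathbf{J}_w^{-2}\boldsymbol{\alpha}_j}{1 + c_j \boldsymbol{\alpha}_j^T\mathbf{J}_w^{-1}\boldsymbol{\alpha}_j}. \qquad ■$$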
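The rank-one update derived above can be checked against a direct computation. The sketch below is illustrative and not the authors' code: `fisher` evaluates the Fisher information for a set of active sensor indices, and `greedy_gain` evaluates the closed-form improvement when one more sensor joins the set, covering both the nonempty and the empty case. The names and the 0-based index convention are assumptions made here.

```python
import numpy as np

def fisher(Sigma_inv, H, R, idx):
    """J_w = Sigma^{-1} + H_w^T R_w^{-1} H_w for the active sensor indices idx."""
    if len(idx) == 0:
        return Sigma_inv.copy()
    Hs = H[idx]
    return Sigma_inv + Hs.T @ np.linalg.solve(R[np.ix_(idx, idx)], Hs)

def greedy_gain(Sigma_inv, H, R, idx, j):
    """Closed-form tr(J_w^{-1}) - tr(J_wt^{-1}) when sensor j is added to idx."""
    Jw_inv = np.linalg.inv(fisher(Sigma_inv, H, R, idx))
    if len(idx) == 0:
        c, alpha = 1.0 / R[j, j], -H[j]                 # empty-set case, cf. (48)
    else:
        Rw_inv_r = np.linalg.solve(R[np.ix_(idx, idx)], R[idx, j])
        c = 1.0 / (R[j, j] - R[idx, j] @ Rw_inv_r)      # inverse Schur complement
        alpha = H[idx].T @ Rw_inv_r - H[j]
    Ja = Jw_inv @ alpha                                  # J_w^{-1} alpha
    return c * (Ja @ Ja) / (1.0 + c * (alpha @ Ja))
```

The closed form needs only one inverse of the current Fisher information matrix, which can be shared across all candidate sensors in a greedy step.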


Appendix B 

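Proof of Proposition 2

Our goal is to simplify the Fisher information matrix
given by (7) under the assumption of weak noise correlation.
According to the weak-correlation model $\mathbf{R} = \boldsymbol{\Lambda} + \epsilon\boldsymbol{\Upsilon}$, we obtain

$$\begin{aligned}
\mathbf{R}_w^{-1} &= (\boldsymbol{\Phi}_w\boldsymbol{\Lambda}\boldsymbol{\Phi}_w^T + \epsilon\boldsymbol{\Phi}_w\boldsymbol{\Upsilon}\boldsymbol{\Phi}_w^T)^{-1} \\
&\overset{(1)}{=} (\mathbf{I} + \epsilon\boldsymbol{\Phi}_w\boldsymbol{\Lambda}^{-1}\boldsymbol{\Phi}_w^T\boldsymbol{\Phi}_w\boldsymbol{\Upsilon}\boldsymbol{\Phi}_w^T)^{-1}\boldsymbol{\Phi}_w\boldsymbol{\Lambda}^{-1}\boldsymbol{\Phi}_w^T \\
&\overset{(2)}{=} \boldsymbol{\Phi}_w\boldsymbol{\Lambda}^{-1}\boldsymbol{\Phi}_w^T - \epsilon\boldsymbol{\Phi}_w\boldsymbol{\Lambda}^{-1}\boldsymbol{\Phi}_w^T\boldsymbol{\Phi}_w\boldsymbol{\Upsilon}\boldsymbol{\Phi}_w^T\boldsymbol{\Phi}_w\boldsymbol{\Lambda}^{-1}\boldsymbol{\Phi}_w^T + O(\epsilon^2) \quad (\text{as } \epsilon \to 0) \\
&\overset{(3)}{=} \boldsymbol{\Phi}_w\boldsymbol{\Lambda}^{-1}\boldsymbol{\Phi}_w^T - \epsilon\boldsymbol{\Phi}_w\boldsymbol{\Lambda}^{-1}\mathbf{D}_w\boldsymbol{\Upsilon}\mathbf{D}_w\boldsymbol{\Lambda}^{-1}\boldsymbol{\Phi}_w^T + O(\epsilon^2) \quad (\text{as } \epsilon \to 0),
\end{aligned} \tag{49}$$

where $\mathbf{D}_w := \operatorname{diag}(\mathbf{w})$. In (49), step (1) holds since we use
the facts that $\boldsymbol{\Lambda}$ is a diagonal matrix and $(\boldsymbol{\Phi}_w\boldsymbol{\Lambda}\boldsymbol{\Phi}_w^T)^{-1} = \boldsymbol{\Phi}_w\boldsymbol{\Lambda}^{-1}\boldsymbol{\Phi}_w^T$;
step (2) is obtained from the Taylor series expansion
$(\mathbf{I} + \epsilon\mathbf{X})^{-1} = \sum_{i=0}^{\infty} (-\epsilon\mathbf{X})^i$ as $\epsilon \to 0$ (namely, the
spectrum of $\epsilon\mathbf{X}$ is contained inside the open unit disk); step
(3) is true since $\boldsymbol{\Phi}_w^T\boldsymbol{\Phi}_w = \mathbf{D}_w$.

Substituting (49) into (7), we obtain

$$\begin{aligned}
\mathbf{J}_w &= \boldsymbol{\Sigma}^{-1} + \mathbf{H}^T\boldsymbol{\Phi}_w^T(\boldsymbol{\Phi}_w\boldsymbol{\Lambda}^{-1}\boldsymbol{\Phi}_w^T - \epsilon\boldsymbol{\Phi}_w\boldsymbol{\Lambda}^{-1}\mathbf{D}_w\boldsymbol{\Upsilon}\mathbf{D}_w\boldsymbol{\Lambda}^{-1}\boldsymbol{\Phi}_w^T)\boldsymbol{\Phi}_w\mathbf{H} + O(\epsilon^2) \quad (\text{as } \epsilon \to 0) \\
&\overset{(1)}{=} \boldsymbol{\Sigma}^{-1} + \mathbf{H}^T(\mathbf{D}_w\boldsymbol{\Lambda}^{-1}\mathbf{D}_w - \epsilon\mathbf{D}_w\boldsymbol{\Lambda}^{-1}\boldsymbol{\Upsilon}\boldsymbol{\Lambda}^{-1}\mathbf{D}_w)\mathbf{H} + O(\epsilon^2) \quad (\text{as } \epsilon \to 0) \\
&= \boldsymbol{\Sigma}^{-1} + \mathbf{H}^T\mathbf{D}_w(\boldsymbol{\Lambda}^{-1} - \epsilon\boldsymbol{\Lambda}^{-1}\boldsymbol{\Upsilon}\boldsymbol{\Lambda}^{-1})\mathbf{D}_w\mathbf{H} + O(\epsilon^2) \quad (\text{as } \epsilon \to 0) \\
&\overset{(2)}{=} \boldsymbol{\Sigma}^{-1} + \mathbf{H}^T\mathbf{D}_w\mathbf{R}^{-1}\mathbf{D}_w\mathbf{H} + O(\epsilon^2) \quad (\text{as } \epsilon \to 0) \\
&\overset{(3)}{=} \boldsymbol{\Sigma}^{-1} + \mathbf{H}^T(\mathbf{w}\mathbf{w}^T \circ \mathbf{R}^{-1})\mathbf{H} + O(\epsilon^2) \quad (\text{as } \epsilon \to 0),
\end{aligned}$$

where step (1) is achieved by using the fact that $\mathbf{D}_w\boldsymbol{\Lambda}^{-1} = \boldsymbol{\Lambda}^{-1}\mathbf{D}_w = \mathbf{D}_w\boldsymbol{\Lambda}^{-1}\mathbf{D}_w$,
step (2) holds due to $\mathbf{R}^{-1} = \boldsymbol{\Lambda}^{-1} - \epsilon\boldsymbol{\Lambda}^{-1}\boldsymbol{\Upsilon}\boldsymbol{\Lambda}^{-1} + O(\epsilon^2)$,
and step (3) is true since $\mathbf{D}_w$ is diagonal
and has only binary elements. ■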
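The second-order accuracy of the weak-correlation approximation can be observed numerically. The sketch below is an illustration under assumed test data: the covariance is an identity matrix plus a small symmetric perturbation with zero diagonal, scaled by `eps`. It compares the exact Fisher information for a fixed selection with the Hadamard-product approximation; the sizes, the selection vector, and the function names are hypothetical.

```python
import numpy as np

def exact_fisher(Sigma_inv, H, R, w):
    """Exact J_w = Sigma^{-1} + H_w^T R_w^{-1} H_w for a Boolean selection w."""
    idx = np.flatnonzero(w)
    Hs = H[idx]
    return Sigma_inv + Hs.T @ np.linalg.solve(R[np.ix_(idx, idx)], Hs)

def weak_corr_fisher(Sigma_inv, H, R, w):
    """Approximation Sigma^{-1} + H^T (w w^T o R^{-1}) H valid for weak correlation."""
    return Sigma_inv + H.T @ (np.outer(w, w) * np.linalg.inv(R)) @ H

def approximation_error(eps, seed=0):
    """Frobenius norm of the gap for R = I + eps * Upsilon (assumed test model)."""
    rng = np.random.default_rng(seed)
    m, n = 6, 2
    H = rng.standard_normal((m, n))
    U = rng.standard_normal((m, m))
    Upsilon = (U + U.T) / 2
    np.fill_diagonal(Upsilon, 0.0)                  # correlation lives off the diagonal
    R = np.eye(m) + eps * Upsilon
    w = np.array([1, 0, 1, 1, 0, 1], dtype=float)
    gap = exact_fisher(np.eye(n), H, R, w) - weak_corr_fisher(np.eye(n), H, R, w)
    return np.linalg.norm(gap)
```

Reducing `eps` by a factor of ten should shrink the gap by roughly a factor of one hundred, consistent with an error term of order $\epsilon^2$.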

Appendix C 

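Proof of Proposition 3

We begin by simplifying the objective function in (P2),

$$\begin{aligned}
\phi(\mathbf{w}) &:= \operatorname{tr}(\boldsymbol{\Sigma}^{-1}) + \operatorname{tr}\left((\mathbf{w}\mathbf{w}^T \circ \mathbf{R}^{-1})(\mathbf{H}\mathbf{H}^T)\right) \\
&= \operatorname{tr}(\boldsymbol{\Sigma}^{-1}) + \sum_{i=1}^{m}\sum_{j=1}^{m} w_i w_j \bar{R}_{ij} \mathbf{h}_i^T\mathbf{h}_j \\
&= \operatorname{tr}(\boldsymbol{\Sigma}^{-1}) + \mathbf{w}^T\boldsymbol{\Omega}\mathbf{w},
\end{aligned} \tag{50}$$

where $\bar{R}_{ij}$ is the $(i, j)$th entry of $\mathbf{R}^{-1}$, and $\bar{R}_{ij}\mathbf{h}_i^T\mathbf{h}_j$ corresponds
to the $(i, j)$th entry of $\boldsymbol{\Omega}$, which yields the succinct
form

$$\boldsymbol{\Omega} = \mathbf{A}(\mathbf{R}^{-1} \otimes \mathbf{I}_n)\mathbf{A}^T. \tag{51}$$

In (51), $\otimes$ denotes the Kronecker product, $\mathbf{A} \in \mathbb{R}^{m \times mn}$ is
a block-diagonal matrix whose diagonal blocks are given by
$\{\mathbf{h}_i^T\}_{i=1}^m$, and $\boldsymbol{\Omega} \succeq 0$ due to $\mathbf{R}^{-1} \otimes \mathbf{I}_n \succeq 0$.

According to (50), (P2) can be rewritten as

$$\begin{array}{ll}
\underset{\mathbf{w}}{\text{maximize}} & \mathbf{w}^T\boldsymbol{\Omega}\mathbf{w} \\
\text{subject to} & \mathbf{1}^T\mathbf{w} \le s, \\
& \mathbf{w} \in \{0,1\}^m.
\end{array} \tag{52}$$

Next, we prove that problem (35) is equivalent to problem (52).
We recall that the former is a relaxation of the
latter, where the former entails the maximization of a convex
quadratic function over a bounded polyhedron $\mathcal{P} :=
\{\mathbf{w} \mid \mathbf{1}^T\mathbf{w} \le s, \mathbf{w} \in [0,1]^m\}$. It has been shown in [46]
that optimal solutions of such a problem occur at vertices of
the polyhedron $\mathcal{P}$, which are zero-one vectors. This indicates
that solutions of problem (35) are feasible for problem (52).
Therefore, solutions of (35) are solutions of (52), and vice
versa. ■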
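The vertex property used in the proof can be illustrated on a small instance. The sketch below assumes an integer budget `s` and a positive semidefinite `Omega`; it enumerates all feasible Boolean vectors to find the best vertex value and draws random points of the relaxed feasible set for comparison. The function names and the sampling scheme are choices made here for illustration, and the enumeration is only practical for small `m`.

```python
import itertools
import numpy as np

def best_vertex_value(Omega, s):
    """Maximum of w^T Omega w over Boolean w with 1^T w <= s (brute force)."""
    m = Omega.shape[0]
    best = 0.0
    for bits in itertools.product([0.0, 1.0], repeat=m):
        w = np.array(bits)
        if w.sum() <= s:
            best = max(best, w @ Omega @ w)
    return best

def random_feasible_point(rng, m, s):
    """Random point of {w : 1^T w <= s, 0 <= w <= 1}, rescaled if over budget."""
    w = rng.uniform(0.0, 1.0, size=m)
    total = w.sum()
    return w if total <= s else w * (s / total)
```

No sampled point of the relaxed set should exceed the best vertex value, since a convex function attains its maximum over a bounded polyhedron at a vertex.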